Automatic Evaluation of Voice Quality Using Text-Based Laryngograph Measurements and Prosodic Analysis

نویسندگان

  • Tino Haderlein
  • Cornelia Schwemmle
  • Michael Döllinger
  • Václav Matousek
  • Martin Ptok
  • Elmar Nöth
چکیده

Due to low intra- and interrater reliability, perceptual voice evaluation should be supported by objective, automatic methods. In this study, text-based, computer-aided prosodic analysis and measurements of connected speech were combined in order to model perceptual evaluation of the German Roughness-Breathiness-Hoarseness (RBH) scheme. 58 connected speech samples (43 women and 15 men; 48.7 ± 17.8 years) containing the German version of the text "The North Wind and the Sun" were evaluated perceptually by 19 speech and voice therapy students according to the RBH scale. For the human-machine correlation, Support Vector Regression with measurements of the vocal fold cycle irregularities (CFx) and the closed phases of vocal fold vibration (CQx) of the Laryngograph and 33 features from a prosodic analysis module were used to model the listeners' ratings. The best human-machine results for roughness were obtained from a combination of six prosodic features and CFx (r = 0.71, ρ = 0.57). These correlations were approximately the same as the interrater agreement among human raters (r = 0.65, ρ = 0.61). CQx was one of the substantial features of the hoarseness model. For hoarseness and breathiness, the human-machine agreement was substantially lower. Nevertheless, the automatic analysis method can serve as the basis for a meaningful objective support for perceptual analysis.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Automatic evaluation of tracheoesophageal substitute voice: sustained vowel versus standard text.

OBJECTIVE The Hoarseness Diagram, a program for voice quality analysis used in German-speaking countries, was compared with an automatic speech recognition system with a module for prosodic analysis. The latter computed prosodic features on the basis of a text recording. We examined whether voice analysis of sustained vowels and text analysis correlate in tracheoesophageal speakers. PATIENTS ...

متن کامل

Automatic Rating of Hoarseness by Text-based Cepstral and Prosodic Evaluation

The standard for the analysis of distorted voices is perceptual rating of read-out texts or spontaneous speech. Automatic voice evaluation, however, is usually done on stable sections of sustained vowels. In this paper, text-based and established vowel-based analysis are compared with respect to their ability to measure hoarseness and its subclasses. 73 hoarse patients (48.3± 16.8 years) uttere...

متن کامل

Voice Analysis in English and Persian Persuasive Texts: Pedagogical implications in focus

The main purpose of this study is to investigate how voice is realized by Iranian EFL learners in persuasive English and Persian text types. This discourse-related notion is a required criterion for writing acceptable English. However, L2 learners from cultures other than English might face problems in realizing it, or even ignore it all through their writing. In this connection, the present st...

متن کامل

Voice Analysis in English and Persian Persuasive Texts: Pedagogical implications in focus

The main purpose of this study is to investigate how voice is realized by Iranian EFL learners in persuasive English and Persian text types. This discourse-related notion is a required criterion for writing acceptable English. However, L2 learners from cultures other than English might face problems in realizing it, or even ignore it all through their writing. In this connection, the present st...

متن کامل

Intelligibility Rating with Automatic Speech Recognition, Prosodic, and Cepstral Evaluation

For voice rehabilitation, speech intelligibility is an important criterion. Automatic evaluation of intelligibility has been shown to be successful for automatic speech recognition methods combined with prosodic analysis. In this paper, this method is extended by using measures based on the Cepstral Peak Prominence (CPP). 73 hoarse patients (48.3± 16.8 years) uttered the vowel /e/ and read the ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره 2015  شماره 

صفحات  -

تاریخ انتشار 2015